THE BINOMIAL IDEAL OF THE INTERSECTION AXIOM 
FOR CONDITIONAL PROBABILITIES 



ALEX FINK



Abstract. The binomial ideal associated with the intersection axiom of con- 
ditional probability is shown to be radical and is expressed as an intersection 
of toric prime ideals. This solves a problem in algebraic statistics posed by 
Cartwright and Engström.

Conditional independence constraints are a family of natural constraints on probability
distributions, describing situations in which two random variables are independently
distributed given knowledge of a third. Statistical models built around
considerations of conditional independence, in particular graphical models in which 
the constraints are encoded in a graph on the random variables, enjoy wide appli- 
cability in determining relationships among random variables in statistics and in 
dealing with uncertainty in artificial intelligence. 

One can take a purely combinatorial perspective on the study of conditional 
independence, as does Studený [10], conceiving of it as a relation on triples of
subsets of a set of observables which must satisfy certain axioms. A number of 
elementary implications among conditional independence statements are recognised 
as axioms. Among these are the semi-graphoid axioms, which are implications 
of conditional independence statements lacking further hypotheses, and hence are 
purely combinatorial statements. The intersection axiom is also often added to the 
collection, but unlike the semi-graphoid axioms it is not uniformly true; it is our 
subject here. 

Formally, a conditional independence model $\mathcal{M}$ is a set of probability distributions
characterised by satisfying several conditional independence constraints. We
will work in the discrete setting, where a probability distribution p is a multi-way
table of probabilities, and we follow the notational conventions in [1].

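Consider the discrete conditional independence model $\mathcal{M}$ given by

    $\{X_1 \perp\!\!\!\perp X_2 \mid X_3,\ X_1 \perp\!\!\!\perp X_3 \mid X_2\}$

where $X_i$ is a random variable taking values in the set $[r_i] = \{1, \ldots, r_i\}$. Throughout
we assume $r_i \geq 2$. Let $p_{ijk}$ be the unknown probability $P(X_1 = i, X_2 = j, X_3 = k)$
in a distribution from the model $\mathcal{M}$. The set of distributions in the model $\mathcal{M}$ is
the variety whose defining ideal $I_{\mathcal{M}} \subseteq S = \mathbb{C}[p_{ijk}]$ is

    $I_{\mathcal{M}} = \langle p_{ijk} p_{i'j'k} - p_{ij'k} p_{i'jk} : i, i' \in [r_1],\ j, j' \in [r_2],\ k \in [r_3] \rangle$

    $\qquad + \langle p_{ijk} p_{i'jk'} - p_{ijk'} p_{i'jk} : i, i' \in [r_1],\ j \in [r_2],\ k, k' \in [r_3] \rangle.$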



Department of Mathematics, University of California, Berkeley, finka@math.berkeley.edu.


The intersection axiom is the axiom whose premises are the statements of $\mathcal{M}$ and
whose conclusion is $X_1 \perp\!\!\!\perp (X_2, X_3)$. This implication requires the further hypothesis
that the distribution p is in the interior of the probability simplex, i.e. that no
individual probability $p_{ijk}$ is zero. It is thus a natural question to ask what can
be inferred about distributions p which may lie on the boundary of the probability
simplex. In algebraic terms, we are asking for a primary decomposition of $I_{\mathcal{M}}$.
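The failure of the intersection axiom on the boundary can be checked by hand on a small table. The following is a minimal sketch, assuming $r_1 = r_2 = r_3 = 2$, 0-based indices, and a table stored as a dictionary; the helper names (`model_binomials`, `conclusion_binomials`, `vanishes`, `table`) are hypothetical and are not part of any computer algebra package used by the author.

```python
from fractions import Fraction
from itertools import product

def model_binomials(r1, r2, r3):
    """Yield the generators of I_M as ((a, b), (c, d)), standing for p[a]*p[b] - p[c]*p[d]."""
    I, J, K = range(r1), range(r2), range(r3)
    for i, i2, j, j2, k in product(I, I, J, J, K):   # X1 independent of X2 given X3
        yield ((i, j, k), (i2, j2, k)), ((i, j2, k), (i2, j, k))
    for i, i2, j, k, k2 in product(I, I, J, K, K):   # X1 independent of X3 given X2
        yield ((i, j, k), (i2, j, k2)), ((i, j, k2), (i2, j, k))

def conclusion_binomials(r1, r2, r3):
    """Yield the binomials expressing the conclusion X1 independent of (X2, X3)."""
    I, J, K = range(r1), range(r2), range(r3)
    for i, i2, j, j2, k, k2 in product(I, I, J, J, K, K):
        yield ((i, j, k), (i2, j2, k2)), ((i, j2, k2), (i2, j, k))

def vanishes(binomials, p):
    """True if every binomial evaluates to zero on the table p."""
    return all(p[a] * p[b] == p[c] * p[d] for (a, b), (c, d) in binomials)

def table(r1, r2, r3, entries):
    """Build a full table from a sparse dict, filling missing cells with 0."""
    cells = product(range(r1), range(r2), range(r3))
    return {idx: Fraction(entries.get(idx, 0)) for idx in cells}

# A boundary distribution: supported on (j, k) in {(0, 0), (1, 1)} only,
# with non-proportional vectors (1/8, 3/8) and (3/8, 1/8) in the i direction.
boundary = table(2, 2, 2, {(0, 0, 0): Fraction(1, 8), (1, 0, 0): Fraction(3, 8),
                           (0, 1, 1): Fraction(3, 8), (1, 1, 1): Fraction(1, 8)})
# An interior distribution: the uniform one.
uniform = {idx: Fraction(1, 8) for idx in product(range(2), repeat=3)}

print(vanishes(model_binomials(2, 2, 2), boundary))       # premises hold
print(vanishes(conclusion_binomials(2, 2, 2), boundary))  # conclusion fails
```

The boundary table satisfies both premises because every generator of the ideal has a zero factor in each of its two terms, yet the two nonzero vectors are not proportional, so the conclusion fails.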

Our Proposition 1 resolves a problem posed by Dustin Cartwright and Alexander
Engström in [1, p. 152]. The problem concerned the primary decomposition of $I_{\mathcal{M}}$;
they conjectured a description in terms of subgraphs of a complete bipartite graph,
which we show here to be correct.

In the course of this project the author carried out computations of primary
decompositions for the ideal $I_{\mathcal{M}}$ for various values of $r_1$, $r_2$, and $r_3$ with the computer
algebra system Singular [4, 5]. Thomas Kahle has recently written dedicated
Macaulay2 [3] code for binomial primary decompositions [7], in which the same
computations may be carried out.

A broad generalisation of this paper's results to the class of binomial edge ideals 
of graphs has been obtained by Herzog, Hibi, Hreinsdóttir, Kahle, and Rauh [6].

Let $K_{p,q}$ be the complete bipartite graph with bipartitioned vertex set $[p] \amalg [q]$.
We say that a subgraph G of $K_{r_2,r_3}$ is admissible if G has vertex set $[r_2] \amalg [r_3]$ and
all connected components of G are isomorphic to some complete bipartite graph
$K_{p,q}$ with $p, q \geq 1$.
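The definition of admissibility is easy to test mechanically. The sketch below is an illustration only, with hypothetical helper names, 0-based vertex labels, and a brute-force enumeration over all subgraphs of $K_{p,q}$ that is feasible only for very small $p$ and $q$.

```python
from itertools import product

def components(p, q, edges):
    """Connected components of a bipartite graph on [p] and [q], via union-find.
    Vertices are ('L', j) for j < p and ('R', k) for k < q."""
    parent = {('L', j): ('L', j) for j in range(p)}
    parent.update({('R', k): ('R', k) for k in range(q)})

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for j, k in edges:
        parent[find(('L', j))] = find(('R', k))
    comps = {}
    for v in parent:
        comps.setdefault(find(v), set()).add(v)
    return list(comps.values())

def is_admissible(p, q, edges):
    """Every component must be a complete bipartite graph with both sides nonempty."""
    edges = set(edges)
    for comp in components(p, q, edges):
        left = [j for side, j in comp if side == 'L']
        right = [k for side, k in comp if side == 'R']
        if not left or not right:
            return False  # an isolated vertex
        if any((j, k) not in edges for j in left for k in right):
            return False  # connected but not complete bipartite
    return True

def count_admissible(p, q):
    """Count admissible graphs by testing every subset of the p*q possible edges."""
    all_edges = list(product(range(p), range(q)))
    total = 0
    for mask in range(1 << len(all_edges)):
        edges = [e for b, e in enumerate(all_edges) if (mask >> b) & 1]
        total += is_admissible(p, q, edges)
    return total

print(count_admissible(2, 2), count_admissible(2, 3))
```

On $[2] \amalg [2]$ the admissible graphs are $K_{2,2}$ itself and the two perfect matchings; a path with three edges is connected but not complete bipartite, so it is rejected.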

Given a subgraph G with edge set Edges(G), the prime $P_G$ to which it corresponds
is defined to be

(1)    $P_G = P_G^{(0)} + P_G^{(1)}$

where

    $P_G^{(0)} = \langle p_{ijk} : i \in [r_1],\ (j, k) \notin \mathrm{Edges}(G) \rangle,$

    $P_G^{(1)} = \langle p_{ijk} p_{i'j'k'} - p_{ij'k'} p_{i'jk} : i, i' \in [r_1],$

    $\qquad j, j' \in [r_2]$ and $k, k' \in [r_3]$ in the same connected component of $G \rangle.$

Note that j and j', and k and k', need not be distinct. That is, for $(p_{ijk})$ on the
variety $V(P_G)$, $p_{ijk} = 0$ for $(j, k) \notin \mathrm{Edges}(G)$, and any pair of vectors $p_{\cdot jk}$ and $p_{\cdot j'k'}$
are proportional for (j, k) and (j', k') two edges in Edges(G) in the same connected
component of G. Later we will also want to refer to the individual summands $P_C^{(1)}$
of $P_G^{(1)}$, where $P_C^{(1)}$ includes only those generators in the variables $\{p_{ijk} : (j, k) \in C\}$
arising from edges in the connected component C.

Proposition 1. The set of minimal primes of the ideal $I_{\mathcal{M}}$ is
$\{P_G : G$ an admissible graph on $[r_2] \amalg [r_3]\}$.

In particular, the value of ri is irrelevant to the combinatorial nature of the 
primary decomposition. 

Proposition 1 was the original conjecture of Cartwright and Engström. It is a
purely set-theoretic assertion, and is equivalent to the fact that

(2)    $V(I_{\mathcal{M}}) = \bigcup_G V(P_G)$

as sets, where the union is over admissible graphs G. The ideas of a proof of
Proposition 1 were anticipated in part 4 of the problem stated in [1, §6.6], which
was framed for the prime corresponding to the subgraph $G = K_{r_2,r_3}$, the case where the
conclusion of the intersection axiom is valid; they extend without great difficulty
to the general case.

We will prove a stronger ideal-theoretic result. Let $\prec_{dp}$ be the revlex term order
on S over the lexicographic variable order on subscripts, with earlier subscripts
more significant: thus under $\prec_{dp}$, we have $p_{111} \prec_{dp} p_{112} \prec_{dp} p_{211}$.

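Theorem 2. The primary decomposition

(3)    $I_{\mathcal{M}} = \bigcap_G P_G$

holds and is an irredundant decomposition, where the intersection is over admissible
graphs G on $[r_2] \amalg [r_3]$. We moreover have

    $\operatorname{in}_{\prec_{dp}} I_{\mathcal{M}} = \operatorname{in}_{\prec_{dp}} \bigcap_G P_G = \bigcap_G \operatorname{in}_{\prec_{dp}} P_G.$

Furthermore, each primary component of $\operatorname{in}_{\prec_{dp}} P_G$ is squarefree, so $\operatorname{in}_{\prec_{dp}} I_{\mathcal{M}}$ and
hence $I_{\mathcal{M}}$ are radical ideals.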

It is noted in [1, §6.6] that the number $\eta(p, q)$ of admissible graphs G on $[p] \amalg [q]$
is given by the generating function

(4)    $\exp\bigl((e^x - 1)(e^y - 1)\bigr) = \sum_{p,q \geq 0} \eta(p, q) \frac{x^p y^q}{p!\,q!}$

which in that reference is said to follow from manipulations of Stirling numbers.
This equation (4) can also be obtained as a direct consequence of a bivariate form
of the exponential formula for exponential generating functions [?, §5.1], using the
observation that



    $(e^x - 1)(e^y - 1) = \sum_{p,q \geq 1} \frac{x^p y^q}{p!\,q!}$

is the exponential generating function for complete bipartite graphs with $p, q \geq 1$,
and these are the possible connected components of admissible graphs.
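The generating function can be expanded directly to tabulate small values of the count of admissible graphs. The sketch below is one way to do this under stated assumptions: bivariate power series are stored as dictionaries of exact fractions, truncated at degree n in each variable, and the helper names `mul` and `eta_table` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def mul(f, g, n):
    """Product of two bivariate series {(a, b): coeff}, truncated at degree n in x and in y."""
    h = {}
    for (a, b), c in f.items():
        for (a2, b2), c2 in g.items():
            if a + a2 <= n and b + b2 <= n:
                key = (a + a2, b + b2)
                h[key] = h.get(key, 0) + c * c2
    return h

def eta_table(n):
    """Coefficients eta(p, q) of exp((e^x - 1)(e^y - 1)) for 0 <= p, q <= n."""
    # g = (e^x - 1)(e^y - 1) = sum over p, q >= 1 of x^p y^q / (p! q!)
    g = {(a, b): Fraction(1, factorial(a) * factorial(b))
         for a in range(1, n + 1) for b in range(1, n + 1)}
    # exp(g) = sum_k g^k / k!; every term of g^k has x-degree at least k,
    # so powers beyond k = n do not contribute after truncation.
    total = {(0, 0): Fraction(1)}
    power = {(0, 0): Fraction(1)}
    for k in range(1, n + 1):
        power = mul(power, g, n)
        for key, c in power.items():
            total[key] = total.get(key, 0) + c / factorial(k)
    # Convert exponential generating function coefficients to counts.
    return {(a, b): int(c * factorial(a) * factorial(b)) for (a, b), c in total.items()}

t = eta_table(3)
print(t[(2, 2)], t[(2, 3)], t[(3, 3)])
```

Pairs (p, q) with exactly one of p, q equal to zero do not appear as keys, which matches the fact that no admissible graph has an isolated vertex.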

We now review some standard facts on binomial and toric ideals [2]. Let I
be a binomial ideal in $\mathbb{C}[x_1, \ldots, x_n]$, generated by binomials of the form $x^v - x^w$
with $v, w \in \mathbb{N}^n$. There is a lattice $L_I \subseteq \mathbb{Z}^n$ such that the localisation $I_{x_1 \cdots x_n} \subseteq
\mathbb{C}[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]$ has the form $\langle x^v - 1 : v \in L_I \rangle$, provided that this localisation is
a proper ideal, i.e. I contains no monomial. If $\phi_I : \mathbb{Z}^n \to \mathbb{Z}^m$ is a $\mathbb{Z}$-linear map
whose kernel is $L_I$, then $\phi_I$ provides a multigrading with respect to which I is
homogeneous. In statistical terms $\phi_I$ computes the minimal sufficient statistics for
the statistical model associated to I.

Given a multivariate Laurent polynomial $f \in \mathbb{C}[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]$, f lies in $I_{x_1 \cdots x_n}$
if and only if, for each fiber F of $\phi_I$, the sum of the coefficients on all monomials
$x^w$ with $w \in F$ is zero. With respect to $\mathbb{C}[x_1, \ldots, x_n]$ a modified statement holds,
as follows. For each fiber F, consider the graph $\Gamma_F(I)$ whose vertices are the set of
vectors in F with all entries nonnegative, and whose edge set is $\{(v, w) : x^v - x^w$ is
a monomial multiple of a generator of $I\}$. In the statistical context these edges are
known as moves. Then f lies in $I \subseteq \mathbb{C}[x_1, \ldots, x_n]$ if and only if, for each connected
component C of each $\Gamma_F(I)$, the sum of the coefficients on all monomials $x^w$ with
$w \in C$ is zero. In particular I is determined by this set of connected components.

Viewing $I \subseteq \mathbb{C}[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]$ as the ideal of the toric subvariety of $(\mathbb{C}^*)^n$
associated to the lattice polytope $\Delta$, Sturmfels shows that the radicals of the
monomial initial ideals of I are exactly the Stanley-Reisner ideals of regular
triangulations of $\Delta$. The Stanley-Reisner ideal $I_\Delta$ of a simplicial complex $\Delta$ on a set T
is the monomial ideal of $\mathbb{C}[x_t : t \in T]$ generated as a vector space by the products
of variables $x_{t_1} \cdots x_{t_k}$ for which $\{t_1, \ldots, t_k\}$ is not contained in any face of $\Delta$. Every
squarefree monomial ideal is the Stanley-Reisner ideal of some simplicial complex,
and primary decompositions of Stanley-Reisner ideals are easily described: $I_\Delta$ is
the intersection of the ideals $\langle x_t : t \notin F \rangle$ over all facets F of $\Delta$.

Sturmfels also treats explicitly the ideal I of 2 × 2 minors of an r × s matrix
$Y = (y_{ij})$, of which $P_K := P_{K_{r_2,r_3}}$ is a particular case. In this case the polytope $\Delta$
is the product of two simplices, $\Delta_{r-1} \times \Delta_{s-1}$.

Theorem 3 ([5]). Let I be the ideal of 2 × 2 minors of an r × s matrix of indeterminates.
For any term order $\prec$, $\operatorname{in}_\prec I$ is a squarefree monomial ideal.

This immediately yields the radicality claim of Theorem 2: the $\operatorname{in}_\prec P_G$ are squarefree
monomial ideals, so their associated primes are generated by subsets of the
variables $\{p_{ijk}\}$.

We repeat from Sturmfels one especially describable example of an initial ideal of this
ideal I, namely $\operatorname{in}_{\prec_{dp}} I$, corresponding to the case that $\Delta$ is the so-called staircase
triangulation. Then the vertices of the simplices of $\Delta$ correspond to those sets $\pi$
of entries of the matrix Y which form ("staircase") paths through Y starting at
the upper-left corner, taking only steps right and down, and terminating at the
lower-right corner. Hence to each such $\pi$ corresponds one primary component $Q_\pi$,
generated by all $(r - 1)(s - 1)$ indeterminates not lying on $\pi$. Note that staircase
paths are maximal subsets of indeterminates not including both $y_{ij'}$ and $y_{i'j}$ for
any $i < i'$ and $j < j'$.
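The closing observation about staircase paths can be checked by brute force on small matrices. The following sketch assumes 0-based cell coordinates (row, column) and hypothetical helper names; it enumerates right/down paths from the upper-left to the lower-right corner and compares them with the maximal sets of cells that contain no pair $y_{ij'}$, $y_{i'j}$ with $i < i'$ and $j < j'$.

```python
from itertools import combinations
from math import comb

def staircase_paths(r, s):
    """All right/down paths from cell (0, 0) to cell (r-1, s-1), as frozensets of cells."""
    def extend(i, j):
        if (i, j) == (r - 1, s - 1):
            yield [(i, j)]
            return
        if i + 1 < r:                      # step down
            for rest in extend(i + 1, j):
                yield [(i, j)] + rest
        if j + 1 < s:                      # step right
            for rest in extend(i, j + 1):
                yield [(i, j)] + rest
    return {frozenset(path) for path in extend(0, 0)}

def avoids_antidiagonals(cells):
    """True if no two cells (a, b), (c, e) satisfy a < c and b > e."""
    return not any(a < c and b > e for (a, b) in cells for (c, e) in cells)

def maximal_avoiding_sets(r, s):
    """Inclusion-maximal sets of cells with no antidiagonal pair, by exhaustive search."""
    cells = [(i, j) for i in range(r) for j in range(s)]
    good = [frozenset(c) for k in range(len(cells) + 1)
            for c in combinations(cells, k) if avoids_antidiagonals(c)]
    return {a for a in good if not any(a < b for b in good)}

paths = staircase_paths(2, 3)
print(len(paths), paths == maximal_avoiding_sets(2, 3))
```

Each path has $r + s - 1$ cells, so its complement has $rs - (r + s - 1) = (r - 1)(s - 1)$ cells, in agreement with the number of generators of each primary component.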

This framework suffices to understand the primary decomposition of $\operatorname{in}_\prec P_G$ for
an arbitrary admissible graph G. Let the connected components of G be $C_1, \ldots, C_l$,
so that, from (1), $\operatorname{in}_\prec P_G$ is the sum of the ideal $\operatorname{in}_\prec P_G^{(0)} = P_G^{(0)}$ and the various
ideals $\operatorname{in}_\prec P_{C_i}^{(1)}$, and moreover these summands use disjoint sets of variables. Suppose
that $\operatorname{in}_\prec P_{C_i}^{(1)} = \bigcap_j Q_{C_i,j}$ are primary decompositions of the $\operatorname{in}_\prec P_{C_i}^{(1)}$. Then it
follows that we have the primary decomposition

    $\operatorname{in}_\prec P_G = \bigcap_j \Bigl( P_G^{(0)} + \sum_{i=1}^{l} Q_{C_i, j_i} \Bigr)$

where $j = (j_1, \ldots, j_l)$ ranges over the Cartesian product of the index sets of the $Q_{C_i,j}$.

Proof of Theorem 2. We begin by proving that the right side of (3) is an irredundant
primary decomposition. Let G be an admissible graph. For each connected
component $C \subseteq G$, $P_C^{(1)}$ is the determinantal ideal of 2 × 2 minors
of the matrix with $r_1$ rows and columns indexed by Edges(C), whose (i, (j, k)) entry
is $p_{ijk}$. Being a determinantal ideal, $P_C^{(1)}$ is prime. The ideal $P_G^{(0)}$ is also prime,
as it is generated by a collection of variables. Now $P_G$ is the sum of the prime
ideals $P_G^{(0)}$ and $P_C^{(1)}$ for each C, and the generators of these primes involve pairwise
disjoint subsets of the unknowns $p_{ijk}$. It follows that $P_G$ itself is prime.

Irredundance is the assertion that for G and G' distinct admissible graphs, $P_G$
is not contained in $P_{G'}$. As above, we will think of the 3-tensor $(p_{ijk})$ as a size
$r_2 \times r_3$ table whose entries are vectors $(p_{\cdot jk})$ of length $r_1$. Then if $(p_{ijk}) \in V(P_G)$,
all nonzero vectors in each subtable determined by a connected component of G
are proportional, while vectors outside of any subtable must be the zero vector.
There is an open dense subset $U_G \subseteq V(P_G)$ such that for $(p_{ijk}) \in U_G$, no vector
$p_{\cdot jk}$ associated to a connected component of G is zero, and no two associated to
distinct components are dependent.

Now, G may differ from G' in two fashions. If G contains an edge (j, k) that G'
doesn't, the vector $(p_{\cdot jk})$ is zero on $V(P_{G'})$ but is nonzero on $U_G$; hence $V(P_G) \not\subseteq
V(P_{G'})$. If not, $G \subseteq G'$, but two edges (j, k), (j', k') in different components of G
must be in the same component of G', in which case the vectors $(p_{\cdot jk})$ and $(p_{\cdot j'k'})$
are linearly dependent for $(p_{ijk}) \in V(P_{G'})$ but linearly independent on $U_G$; hence
also $V(P_G) \not\subseteq V(P_{G'})$. This proves irredundance.

Now we turn to proving (3). Let $\prec$ be $\prec_{dp}$. Write $I = I_{\mathcal{M}}$. It is apparent that
$I \subseteq P_G$ for each G. Indeed, given a generator f of I, without loss of generality
$f = p_{ijk} p_{i'j'k} - p_{ij'k} p_{i'jk}$, either both edges (j, k) and (j', k) lie in Edges(G), in
which case f is a generator of $P_G^{(1)}$, or one of these edges is not in Edges(G), in
which case $f \in P_G^{(0)}$. Therefore the containments

    $\operatorname{in}_\prec I \subseteq \operatorname{in}_\prec \bigcap_G P_G \subseteq \bigcap_G \operatorname{in}_\prec P_G$

hold. It now suffices to show an equality of Hilbert functions

(5)    $H(S / \operatorname{in}_\prec I) = H\bigl(S / \bigcap_G \operatorname{in}_\prec P_G\bigr).$

In the present case, the lattice $L_I$ associated to I is generated by all vectors
of the forms $e_{ijk} + e_{i'j'k} - e_{ij'k} - e_{i'jk}$ and $e_{ijk} + e_{i'jk'} - e_{ijk'} - e_{i'jk}$. The map
$\phi_I : \mathbb{Z}^{r_1 r_2 r_3} \to \mathbb{Z}^{r_1 + r_2 r_3}$ sending $(u_{ijk})$ to

    $\Bigl( \bigl(\sum_{j,k} u_{ijk}\bigr)_{i \in [r_1]},\ \bigl(\sum_{i} u_{ijk}\bigr)_{(j,k) \in [r_2] \times [r_3]} \Bigr)$

has kernel $L_I$ and thus induces the multigrading on S by minimal sufficient statistics,
with respect to which I is homogeneous. In fact the analogue of (5) using
Hilbert functions in the multigrading is also true, and it is this we will prove.
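As a sanity check on the multigrading, one can verify numerically that the lattice generators above have multidegree zero. The sketch below assumes that the map records the marginals $d_i = \sum_{j,k} u_{ijk}$ and $d_{jk} = \sum_i u_{ijk}$, which is the form suggested by the components $d_i$ and $d_{jk}$ of a multidegree used in the proof; the function names and the 0-based indexing are hypothetical. It checks only that the generators lie in the kernel, not that they span it.

```python
from itertools import product

def phi(u, r1, r2, r3):
    """Marginals of an exponent table u (dict (i, j, k) -> int): first the d_i, then the d_jk."""
    d_i = [sum(u.get((i, j, k), 0) for j in range(r2) for k in range(r3))
           for i in range(r1)]
    d_jk = [sum(u.get((i, j, k), 0) for i in range(r1))
            for j in range(r2) for k in range(r3)]
    return d_i + d_jk

def lattice_generators(r1, r2, r3):
    """Yield the exponent vectors of the binomial generators of I as sparse dicts."""
    def vec(plus, minus):
        v = {}
        for idx in plus:
            v[idx] = v.get(idx, 0) + 1
        for idx in minus:
            v[idx] = v.get(idx, 0) - 1
        return v
    I, J, K = range(r1), range(r2), range(r3)
    for i, i2, j, j2, k in product(I, I, J, J, K):
        yield vec([(i, j, k), (i2, j2, k)], [(i, j2, k), (i2, j, k)])
    for i, i2, j, k, k2 in product(I, I, J, K, K):
        yield vec([(i, j, k), (i2, j, k2)], [(i, j, k2), (i2, j, k)])

# Every generator has multidegree zero under the assumed map.
print(all(not any(phi(v, 2, 2, 3)) for v in lattice_generators(2, 2, 3)))

# Multidegree of the monomial p_{000} p_{101} for r1 = r2 = r3 = 2.
print(phi({(0, 0, 0): 1, (1, 0, 1): 1}, 2, 2, 2))
```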

Let $d \in \mathbb{Z}^{r_1 + r_2 r_3}$ be the multidegree of some monomial, and write its components
as $d_i$ for $i \in [r_1]$ and $d_{jk}$ for $(j, k) \in [r_2] \times [r_3]$. Let G(d) be the bipartite graph with
vertex set $[r_2] \amalg [r_3]$ and edge set $\{(j, k) : d_{jk} \neq 0\}$. We now prove the following
two claims:

Claim 1. $I_d = (P_{G(d)})_d$.

Claim 2. $\bigl(\bigcap_{G\ \mathrm{admissible}} \operatorname{in}_\prec P_G\bigr)_d = (\operatorname{in}_\prec P_{G(d)})_d$.

These claims, and the fact that an ideal and its initial ideal have the same Hilbert
function, imply

    $H(\operatorname{in}_\prec I)(d) = H(I)(d) = H(P_{G(d)})(d) = H(\operatorname{in}_\prec P_{G(d)})(d) = H\bigl(\bigcap_G \operatorname{in}_\prec P_G\bigr)(d).$

We conclude that (5) holds, proving Theorem 2.

Proof of Claim 1. Observe first that no polynomial homogeneous of multidegree d
can be divisible by any $p_{ijk}$ with $(j, k) \notin \mathrm{Edges}(G(d))$. Accordingly we have
$(P_{G(d)})_d = (P_{G(d)}^{(1)})_d$, in the notation of (1), and we will work with $P_{G(d)}^{(1)}$ hereafter.

Since I and $P_{G(d)}^{(1)}$ are binomial ideals generated by differences of monomials, it
will suffice to show that the two graphs $\Gamma_F(I)$ and $\Gamma_F(P_{G(d)}^{(1)})$ of moves on the fiber
$F = \phi_I^{-1}(d)$ have the same partition into connected components. The refinement
in one direction is clear: $\Gamma_F(I)$ is a subgraph of $\Gamma_F(P_{G(d)}^{(1)})$, since $I_d \subseteq (P_{G(d)})_d =
(P_{G(d)}^{(1)})_d$, and indeed each generator of I of multidegree at most d is a monomial
multiple of a generator of $P_{G(d)}^{(1)}$.

So given an edge of $\Gamma_F(P_{G(d)}^{(1)})$, we must show that this edge is contained in
a connected component of $\Gamma_F(I)$. Let $u, u' \in F$ be the endpoints of an edge of
$\Gamma_F(P_{G(d)}^{(1)})$. Then $u = u' + e_{ijk} + e_{i'j'k'} - e_{ij'k'} - e_{i'jk}$ for some $i, i' \in [r_1]$ and
(j, k), (j', k') edges of G(d) in the same component. By connectedness, there is a
path of edges $e_0 = (j', k'), e_1, \ldots, e_l = (j, k)$ of G(d) such that $e_m$ and $e_{m+1}$ share
a vertex for each m. Corresponding to this path there exists a sequence of moves
$(M_m)_{m=0,\ldots,l-1}$ in I, say $M_m = p^{u_m} - p^{u_{m+1}}$, where $u_0 = u'$, $u_l = u$, and where
$M_m$ is a monomial multiple of a generator of I relating the adjacent edges $e_m$ and
$e_{m+1}$, involving some index $i_m \in [r_1]$. So u and u' are in a single connected component of $\Gamma_F(I)$.
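Claim 1 can be observed experimentally on a small fiber. The sketch below is an illustration under several assumptions: a table is stored as a tuple of columns, one per cell (j, k) with $d_{jk} > 0$, each column listing the entries $u_{ijk}$ over i; a move exchanges one unit between two rows across an allowed pair of cells; and all function names are hypothetical. It compares the partition of the fiber under the moves of I (cells sharing j or k) with the partition under the moves of $P_{G(d)}^{(1)}$ (any two edges in the same component of G(d)).

```python
from itertools import product, combinations

def compositions(n, parts):
    """All tuples of `parts` nonnegative integers summing to n."""
    if parts == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, parts - 1):
            yield (first,) + rest

def fiber(d_i, d_jk):
    """Cells with d_jk > 0 and all nonnegative tables with marginals d_i and d_jk."""
    cells = sorted(c for c in d_jk if d_jk[c] > 0)
    tables = [cols for cols in product(*(compositions(d_jk[c], len(d_i)) for c in cells))
              if all(sum(col[i] for col in cols) == d_i[i] for i in range(len(d_i)))]
    return cells, tables

def components(nodes, neighbors):
    """Connected components of a graph given by a neighbor function, by depth-first search."""
    seen, parts = set(), []
    for start in nodes:
        if start in seen:
            continue
        part, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in neighbors(v):
                if w not in part:
                    part.add(w)
                    stack.append(w)
        seen |= part
        parts.append(frozenset(part))
    return set(parts)

def moves(table, allowed):
    """Tables reached by replacing p_{i,a} p_{i2,b} with p_{i2,a} p_{i,b} for an allowed cell pair."""
    r1 = len(table[0])
    for a, b in allowed:
        for i, i2 in product(range(r1), repeat=2):
            if i != i2 and table[a][i] > 0 and table[b][i2] > 0:
                new = [list(col) for col in table]
                new[a][i] -= 1
                new[a][i2] += 1
                new[b][i2] -= 1
                new[b][i] += 1
                yield tuple(tuple(col) for col in new)

def fiber_partitions(d_i, d_jk):
    """Partitions of the fiber into components under moves of I and of P^(1)_{G(d)}."""
    cells, tables = fiber(d_i, d_jk)
    idx = range(len(cells))
    share = lambda a, b: a != b and (cells[a][0] == cells[b][0] or cells[a][1] == cells[b][1])
    pairs_I = [(a, b) for a, b in combinations(idx, 2) if share(a, b)]
    edge_comps = components(idx, lambda a: [b for b in idx if share(a, b)])
    pairs_P = [(a, b) for comp in edge_comps for a, b in combinations(sorted(comp), 2)]
    return (components(tables, lambda t: moves(t, pairs_I)),
            components(tables, lambda t: moves(t, pairs_P)))

part_I, part_P = fiber_partitions((2, 1), {(0, 0): 1, (0, 1): 1, (1, 1): 1})
print(part_I == part_P, len(part_I))
```

In this example G(d) is a path with three edges, so the edges (0, 0) and (1, 1) are related by a direct move only in the second graph, yet the two partitions of the fiber agree, as the claim predicts.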
Proof of Claim 2. Again, one containment is straightforward, namely $\operatorname{in}_\prec P_{G(d)} \supseteq
\bigcap_G \operatorname{in}_\prec P_G$. There is an admissible graph G such that $P_G \subseteq P_{G(d)}$. Such a G can
be constructed per the discussion of irredundance, if we take p to be a generic point
of $V(P_{G(d)})$. Then $\operatorname{in}_\prec P_G \subseteq \operatorname{in}_\prec P_{G(d)}$, and the former initial ideal is one of the
ideals being intersected in $\bigcap_G \operatorname{in}_\prec P_G$.

For the other containment, let G be any connected bipartite graph on vertex set
$[r_2'] \amalg [r_3']$, such that $d_{jk} = 0$ for $(j, k) \notin E(G)$. By the Stanley-Reisner description of
the initial ideal for $\prec_{dp}$, a monomial $p^u \in S$ of degree d lies in $\operatorname{in}_\prec P_G = \operatorname{in}_\prec P_{K_{r_2',r_3'}}$
if and only if $p^u$ is divisible by $p_{ij'k'} p_{i'jk}$ for some $i < i'$ and $(j, k) < (j', k')$
lexicographically.

So if $p^u$ is a monomial of multidegree d lying in $\operatorname{in}_{\prec_{dp}} P_{G(d)}$, it is divisible by
some $p_{ij'k'} p_{i'jk}$ with $i < i'$ in $[r_1]$ and $(j, k) < (j', k')$ two edges lying in the same
connected component of G(d); it cannot occur that instead $p^u$ is divisible by some
indeterminate $p_{ij''k''}$ for (j'', k'') not an edge of G(d), since $p^u$ has multidegree d.
Now let G be any admissible graph. If G(d) is not a subset of G, then $p^u$ is divisible
by some indeterminate $p_{ij''k''}$ with $(j'', k'') \notin E(G)$, so $p^u \in \operatorname{in}_\prec P_G$. Otherwise
$G(d) \subseteq G$. In this case the edges (j, k) and (j', k') lie in the same component of G,
and so $p_{ij'k'} p_{i'jk} \mid p^u$ implies $p^u \in \operatorname{in}_\prec P_G$ again. Therefore
$(\operatorname{in}_\prec P_{G(d)})_d \subseteq \bigl(\bigcap_G \operatorname{in}_\prec P_G\bigr)_d$.

□

We close with the remark that we can describe explicitly which components
of (2) contain a given point of $V(I_{\mathcal{M}})$. Let $p = (p_{ijk}) \in \mathbb{C}^{r_1 r_2 r_3}$, and define G(p)
to be the bipartite graph on $[r_2] \amalg [r_3]$ with edge set $\{(j, k) : p_{ijk} \neq 0$ for some $i\}$.
Then the components $V(P_G)$ containing $(p_{ijk})$ are exactly those for which G can be
obtained from G(p) by adding edges which don't unite two connected components
of the latter containing respective edges (j, k) and (j', k') such that $p_{\cdot jk}$ and $p_{\cdot j'k'}$
are not proportional. If $p \in U_{G(p)}$, then these components are exactly those for
which G adds only edges which don't unite two connected components of G(p),
neither of which is an isolated vertex.
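The remark can be illustrated by testing membership in $V(P_G)$ directly from the generators in (1). The following is a minimal sketch, assuming $r_1 = r_2 = r_3 = 2$, 0-based indices, integer test points, and hypothetical helper names; the candidate graphs are the three admissible graphs on $[2] \amalg [2]$, listed by hand.

```python
from itertools import combinations

def edge_components(edges):
    """Group the edges (j, k) of a bipartite graph into connected components."""
    comps = []
    for e in edges:
        linked = [c for c in comps if any(e[0] == f[0] or e[1] == f[1] for f in c)]
        comps = [c for c in comps if c not in linked] + [{e}.union(*linked)]
    return comps

def in_variety(p, r1, r2, r3, edges):
    """Test whether the table p (dict (i, j, k) -> number) lies on V(P_G) for G with these edges."""
    edges = set(edges)
    vec = lambda j, k: [p.get((i, j, k), 0) for i in range(r1)]
    # Generators of P_G^(0): every vector off the edges of G must vanish.
    for j in range(r2):
        for k in range(r3):
            if (j, k) not in edges and any(vec(j, k)):
                return False
    # Generators of P_G^(1): within a component, all 2 x 2 minors of pairs of vectors vanish.
    for comp in edge_components(edges):
        for (j, k), (j2, k2) in combinations(sorted(comp), 2):
            v, w = vec(j, k), vec(j2, k2)
            if any(v[a] * w[b] != v[b] * w[a] for a in range(r1) for b in range(r1)):
                return False
    return True

# The three admissible graphs on [2] and [2].
K22 = [(0, 0), (0, 1), (1, 0), (1, 1)]
M1 = [(0, 0), (1, 1)]
M2 = [(0, 1), (1, 0)]

# Two points with the same graph G(p) = M1: non-proportional and proportional vectors.
p = {(0, 0, 0): 1, (1, 0, 0): 3, (0, 1, 1): 3, (1, 1, 1): 1}
q = {(0, 0, 0): 1, (1, 0, 0): 3, (0, 1, 1): 2, (1, 1, 1): 6}

print([in_variety(p, 2, 2, 2, G) for G in (K22, M1, M2)])
print([in_variety(q, 2, 2, 2, G) for G in (K22, M1, M2)])
```

For the first point the edges added to reach $K_{2,2}$ would unite two components carrying non-proportional vectors, so only the matching M1 qualifies; for the second point the vectors are proportional and both M1 and $K_{2,2}$ qualify.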